[00:00.520 --> 00:05.620]  Our agenda starts with an introduction: we will introduce ourselves and what we do,
[00:05.620 --> 00:10.860]  and after that we will look at the Red Team side of things and the DevOps side of things,
[00:10.860 --> 00:17.200]  then we will bring them together in an example, and lastly we will look at how
[00:17.200 --> 00:24.920]  we can maintain our automation codebase. So we're both security engineers and
[00:25.140 --> 00:31.760]  members of Blackbox Security and ITCAT. We blog frequently at Trento Tech and we
[00:31.760 --> 00:38.240]  do Red Team and zero-day research in our spare time. So what we do is we
[00:38.240 --> 00:44.340]  plan a Red Team engagement, and with that plan we design our infrastructure and
[00:44.340 --> 00:50.700]  iterate over it with our team, and once it's stable we automate our infrastructure design
[00:50.700 --> 00:59.280]  using Terraform, and after that, if the operation will take longer than usual or we have to rework
[00:59.280 --> 01:04.760]  one of our infrastructure automation scripts, we create a pipeline for change management
[01:04.760 --> 01:13.020]  so that we remove human error altogether. So the Red Team life cycle looks like this.
[01:13.020 --> 01:18.300]  Usually you do external reconnaissance, and once you compromise a machine you do
[01:18.300 --> 01:25.080]  the same internally until you get domain admin and domain dominance, and after that usually the
[01:25.080 --> 01:33.960]  company has some target machine and target file to exfiltrate, and until you exfiltrate that file
[01:33.960 --> 01:40.940]  you continue your loop of reconnaissance and remote code execution.
[01:41.900 --> 01:48.260]  The operational models for Red Team are like this. The first one is the full-scope penetration
[01:48.260 --> 01:54.300]  test. Although it's a controversial one, you can count it as Red Teaming because
[01:54.300 --> 01:59.520]  you have a black-box scope and you're attempting to gain a foothold in the target and
[01:59.520 --> 02:08.120]  steal the data. So it's really a kind of targeted attack, and although it provides a
[02:08.120 --> 02:14.100]  data point about the state of a security program, it needs a lot of time and resources.
[02:14.380 --> 02:21.600]  A similar one, with a difference, is the long-term Red Team operation. It really needs you to work
[02:21.600 --> 02:27.960]  against the company like you're an actual attacker for a year or two, and it has advantages
[02:28.540 --> 02:36.180]  for the company in that you will map their network and the key divisions so that they can
[02:36.180 --> 02:43.080]  harden their security posture over time. And the hardest part for a Red Team is that
[02:43.600 --> 02:51.120]  you have to maintain your access over a long period of time. The other is the classic Red
[02:51.120 --> 02:59.680]  versus Blue wargames. They strengthen the Blue Team's muscles against the Red Team's tactics,
[02:59.680 --> 03:08.740]  techniques, and procedures, and also the Red Team's muscles against the Blue Team, in that
[03:09.190 --> 03:20.990]  they will see how the Blue Team handles their attacks, etc. So the wargames are usually run in a
[03:21.080 --> 03:28.000]  simulated or scenario-based environment, and they are really good to run from time to time
[03:28.000 --> 03:35.080]  within the company itself. And the last one is Adversary Simulation. Adversary Simulation
[03:35.600 --> 03:42.420]  scenarios are based on some APT or a mock-up scenario with a realistic timeline. So you're
[03:42.420 --> 03:49.600]  basically imitating some APT attack or some kind of scenario that the company would like to assess.
[03:49.600 --> 03:57.780]  So the goal is to exercise that attack without getting identified or discovered
[03:57.780 --> 04:05.080]  by the Blue Team. These exercises are really good for the company because the stages of the scenarios
[04:05.080 --> 04:12.040]  are step by step, so they can see where they fail or where they succeed at every step.
[04:12.580 --> 04:19.560]  So in the light of our operational models, there is also Operational Security. It's a term from the
[04:19.560 --> 04:27.060]  military, and it means denying the adversary information about your side. So as a Red
[04:27.060 --> 04:34.920]  Teamer, you should be able to deny any information that you may have to the Blue Teamers on the
[04:34.920 --> 04:44.880]  other side. In the wild, some threat actors have made operational security mistakes over time.
[04:44.880 --> 04:55.000]  Let's look at some recent errors. The first example is just botnets. They were
[04:55.000 --> 05:01.240]  discovered and doxxed because they did not encrypt their C2 servers and chat sessions.
[05:01.320 --> 05:10.800]  The other example is Forceful. He's a Russian botnet developer and he gave up his C2 server
[05:11.400 --> 05:18.280]  used to carry out the attacks. So he also got exposed. And the last, recent example is that
[05:18.280 --> 05:25.300]  IBM researchers found a huge repository that got exposed due to a security setting
[05:25.300 --> 05:32.880]  misconfiguration. So if you look at these examples, you can clearly see that security
[05:33.960 --> 05:40.580]  settings misconfigurations, and also errors in the C2 or other elements
[05:40.580 --> 05:50.000]  that are prone to human error, compromise operations. So it happens all the time.
[05:50.100 --> 05:56.540]  So that's actually why we need automation. The infrastructure design for the standard
[05:56.540 --> 06:02.300]  penetration testing setup is like this. You usually have some team server and you're directly accessing
[06:02.300 --> 06:09.360]  the organization through this team server. And usually the team server's assets,
[06:09.360 --> 06:15.880]  IP addresses or domains are whitelisted by the organization so that penetration testing can test
[06:15.880 --> 06:22.300]  more clearly. But in the red teaming setup, you usually have different types of servers
[06:22.300 --> 06:31.240]  behind some kind of redirector, whether it would be Traefik or Nginx or HAProxy, etc.
[06:31.240 --> 06:38.460]  So you have multiple servers behind the redirector. The reason for that is you can
[06:38.460 --> 06:46.860]  easily defend and recover your infrastructure after exposure by the blue team.
[06:46.860 --> 06:57.320]  That's also why the redirector approach blends better into the organization's traffic when
[06:57.320 --> 07:04.780]  it comes to the operational side of things. So when we design our infrastructure, we should
[07:04.780 --> 07:10.600]  consider a few things. The first one is the leanness of our infrastructure.
[07:11.280 --> 07:18.360]  Because we want smooth operation, and complicated infrastructure means more maintenance time or
[07:18.360 --> 07:25.420]  more things to look at. So it's the last thing that we want.
[07:25.420 --> 07:31.920]  The second is segmentation. Your tools of choice must be segmented according to
[07:31.920 --> 07:43.220]  their functionality, and they must be accessed through the redirector, not directly,
[07:43.920 --> 07:53.060]  because of the exposure risk from the blue teamers. So this in turn gives us independence
[07:53.700 --> 07:59.020]  for our infrastructure. Since every part is independent from each other, if
[07:59.020 --> 08:05.900]  some of it gets exposed, we can quickly destroy it and set up a new one.
[08:05.900 --> 08:11.120]  And the network footprint of the redirector is a thing to be considered so that we can
[08:11.120 --> 08:18.180]  imitate some real traffic from our redirector to the company, so blue teams don't find us
[08:18.180 --> 08:26.580]  that easily. The engagement domain is also important. Whenever we plan an operation
[08:26.580 --> 08:33.440]  against a company, we always choose a domain from that company's domain category. At least
[08:33.440 --> 08:41.460]  you should try to select it that way. And the payload and C2 specifications are important
[08:41.460 --> 08:51.260]  as well, because they shape your operation,
[08:51.260 --> 08:58.420]  especially if you will use phishing. You should consider these elements
[08:59.220 --> 09:05.400]  when you design your infrastructure because you might then need an SMTP mail server or
[09:05.400 --> 09:12.040]  maybe a third-party service. The same goes for domain fronting, if you will use it.
[09:12.040 --> 09:19.200]  You should consider it when you're starting your design. And last is access control and request
[09:19.200 --> 09:24.880]  processing. You should always exercise access control for your infrastructure and, more
[09:24.880 --> 09:31.940]  importantly, request processing, because if you have some kind of request processing mechanism on
[09:31.940 --> 09:39.720]  the redirectors, you can relay the blue team: if your redirector IP or domain got discovered
[09:39.720 --> 09:48.240]  and the blue team were to visit that domain or IP address or were to run Nmap or some other discovery,
[09:48.240 --> 09:57.060]  you can redirect them or relay them to the actual domain or the website so
[09:57.060 --> 10:05.700]  that you can thwart their discovery work. So this last thing is
[10:05.700 --> 10:09.380]  one of the most important things when you consider your design.
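The request processing idea described above can be illustrated with a small decision function. This is only a minimal sketch under stated assumptions: the path prefix, the user agent string, the upstream name, and the decoy URL are all hypothetical placeholders, not values from the talk, and a real redirector (Nginx, HAProxy, Traefik) would express the same rule in its own configuration rather than in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    path: str
    user_agent: str


# Hypothetical values for illustration only; none of them come from the talk.
EXPECTED_PATH_PREFIX = "/api/v2/status"
EXPECTED_USER_AGENT = "ExampleAgent/1.0"
UPSTREAM_NAME = "c2-backend"
DECOY_SITE = "https://www.example.com"


def route(request: Request) -> tuple[str, str]:
    """Decide what the redirector does with one incoming request.

    Returns ("forward", upstream) when the request matches the expected
    profile, and ("redirect", decoy) for everything else, such as a
    browser visit or a scanner probing the redirector.
    """
    expected = (
        request.path.startswith(EXPECTED_PATH_PREFIX)
        and request.user_agent == EXPECTED_USER_AGENT
    )
    if expected:
        return ("forward", UPSTREAM_NAME)
    return ("redirect", DECOY_SITE)
```

The design choice here is a default of "redirect": anything that does not match the expected profile is sent to the legitimate site, so an unexpected visitor never reaches the servers behind the redirector.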
[10:09.980 --> 10:15.820]  So let's look at the DevOps side of things. DevOps is mainly about creating self-service
[10:15.820 --> 10:23.880]  infrastructure for teams, and since it's self-service and contains automation, it
[10:24.540 --> 10:30.980]  removes the human element, which also removes manual and slow procedures so that you can
[10:30.980 --> 10:40.540]  work continuously, lean and fast. So why are we using automation? For every
[10:40.540 --> 10:46.860]  red team engagement, since engagements are unique most of the time,
[10:46.860 --> 10:53.880]  you have to plan and design an infrastructure, and you need to set it up from a UI or some Bash
[10:53.880 --> 11:00.480]  script or some other means. And if you do that by hand for every engagement, it becomes more
[11:00.480 --> 11:08.380]  error-prone and slow, and it's boring. So we're using automation. So for automation,
[11:08.380 --> 11:15.660]  we use infrastructure as code. It helps you to define, provision, and manage your infrastructure.
[11:15.660 --> 11:22.400]  So if you codify all of your infrastructure, you can track version changes and also validate
[11:22.400 --> 11:28.300]  changes and remove human elements from the infrastructure deployment procedure, because
[11:28.300 --> 11:34.020]  it will be done from the code that you're writing. So the provisioning and deployment
[11:34.020 --> 11:42.480]  process automation is key here. For infrastructure as code, there are a couple of tools.
[11:42.520 --> 11:49.660]  Chef, Puppet, Ansible, SaltStack, CloudFormation and Terraform are common tools for DevOps practices.
[11:50.400 --> 11:56.360]  But why are we using Terraform? Except for Terraform and CloudFormation, the other
[11:56.360 --> 12:01.500]  tools are mainly configuration management tools and designed to manage existing software.
[12:01.500 --> 12:07.240]  But CloudFormation and Terraform are provisioning tools. Although they have some small degree of
[12:07.240 --> 12:13.560]  configuration management capabilities, they are mainly for provisioning the infrastructure itself.
[12:14.020 --> 12:21.100]  And as the other slide states, CloudFormation is mainly for AWS, while Terraform
[12:21.100 --> 12:30.020]  has multi-cloud support, so we are using Terraform. And if you couple Docker with Terraform
[12:30.920 --> 12:38.780]  for the configuration management, all of your needs are covered anyway.
[12:39.340 --> 12:46.760]  There is also another reason why we are using Terraform. In DevOps, there is a term called
[12:46.760 --> 12:55.100]  configuration drift. It happens when you maintain an infrastructure over a long period of time.
[12:55.100 --> 13:02.260]  The servers diverge from one another in terms of software versions, etc. So if you're using
[13:02.260 --> 13:08.800]  Terraform with Docker, you reduce the likelihood of these differences, because every change
[13:08.800 --> 13:15.440]  actually means a new deployment. And another thing is that Terraform has a declarative style of
[13:15.440 --> 13:23.220]  coding. We can give an example: if you need 10 instances and you have to automate that
[13:23.220 --> 13:30.180]  in Ansible, you automate it like this, and in Terraform like this. But if you need one
[13:30.180 --> 13:39.700]  more instance on top of those 10, you have to rewrite your Ansible so that the count is 1. And on
[13:39.700 --> 13:48.200]  the other hand, in Terraform, it's 11, and Terraform will take care of the mathematics itself. So you
[13:48.200 --> 13:54.820]  don't have to write new Ansible, you don't need to write useless scripts anymore, because
[13:54.820 --> 14:01.790]  with Terraform, you can update your existing one and it will keep working.
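The difference between the two styles in the 10-versus-11 example can be sketched in a few lines. This is a toy model, not Terraform's or Ansible's actual implementation: the function names and the `instance-N` naming scheme are assumptions made up for illustration.

```python
def procedural_create(count: int, running: list[str]) -> list[str]:
    """Procedural style: 'create `count` instances', regardless of what exists."""
    new = [f"instance-{len(running) + i}" for i in range(count)]
    return running + new


def declarative_plan(desired_count: int, running: list[str]) -> dict[str, list[str]]:
    """Declarative style: state the desired end result and compute the difference."""
    wanted = [f"instance-{i}" for i in range(desired_count)]
    return {
        "create": [name for name in wanted if name not in running],
        "destroy": [name for name in running if name not in wanted],
    }


# Ten instances are already running.
running = [f"instance-{i}" for i in range(10)]

# Re-running a procedural script with count=11 would add 11 more instances,
# so the script has to be rewritten with count=1 to end up with 11 in total.
after_procedural = procedural_create(11, running)

# The declarative version only needs the new target of 11;
# the plan works out that exactly one instance has to be created.
plan = declarative_plan(11, running)
```

With the declarative version the same definition is reused for every change: only the target number is edited, and the difference against the current state is computed each time.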
[14:02.400 --> 14:08.760]  So the last point is that the other tools require a master and agents to operate.
[14:09.020 --> 14:15.960]  But Terraform is masterless because it talks directly to the APIs, so it removes the need for a
[14:15.960 --> 14:22.460]  master server and agents, which makes our infrastructure leaner.
[14:22.980 --> 14:29.100]  And one of the design considerations that we have is leanness, so Terraform
[14:29.100 --> 14:37.020]  also fits that. And in the light of this, we have three common combinations when it comes to
[14:37.020 --> 14:40.920]  building and automating our infrastructure. The first one is Terraform and Ansible
[14:42.060 --> 14:49.120]  for provisioning and configuration management. But as I said, for a Red Team operation, we need it
[14:50.040 --> 14:56.160]  as lean as possible, so Ansible might not work in that scenario, because it requires a master
[14:56.160 --> 15:04.340]  server and also some agents to operate. The other is, if you're using virtual machines,
[15:04.340 --> 15:10.880]  you can use Packer to template your virtual machines and deploy them with Terraform.
[15:10.920 --> 15:18.260]  And the last and our recommended approach is using Docker and Kubernetes to orchestrate your
[15:18.260 --> 15:24.540]  infrastructure assets and deploy them with Terraform, so that everything is taken
[15:24.540 --> 15:34.220]  care of by Kubernetes. And also, nearly all of the cloud providers have a managed Kubernetes
[15:34.220 --> 15:41.020]  service, so you can also take advantage of that and leave the management side of Kubernetes to
[15:41.020 --> 15:48.100]  them. You can just deploy your Docker containers with the configurations that you want and
[15:48.100 --> 15:57.730]  leave the management side to them, so that you can operate more easily. And let's get to the demo.
[15:58.580 --> 16:03.100]  Hi everyone, this is Caglar. Today we are going to demonstrate building a Red Team
[16:03.100 --> 16:08.580]  operation infrastructure in AWS. We chose AWS, but you are free to use other platforms such as
[16:08.580 --> 16:16.940]  Alibaba or Google Cloud Platform or your own private cloud. Our demo has four phases: inspecting
[16:16.940 --> 16:22.840]  the scripts, running the scripts and building the infrastructure, payload execution, and destroy. In our demo,
[16:22.840 --> 16:28.700]  we just create a simple network topology, but you can extend it according to your needs or requirements.
[16:30.800 --> 16:36.980]  Let's take a quick look at our Terraform and Helm scripts. In our repo, we have four important
[16:36.980 --> 16:47.580]  Terraform scripts. The first one, vpc.tf file, lets you provision a logically isolated section
[16:47.580 --> 16:52.640]  of the AWS cloud, where you can launch AWS resources in a virtual network that you define.
[16:52.640 --> 16:57.320]  You have complete control over your virtual networking environment, including selection
[16:57.320 --> 17:04.420]  of your own IP address range, creation of subnets, and configuration of route tables and network
[17:04.420 --> 17:13.680]  gateways. The EKS cluster file lets you create a Kubernetes cluster with one worker. In our demo,
[17:13.680 --> 17:25.560]  we deploy Nginx Ingress and Metasploit, managed with EKS. The Helm file is used to deploy Nginx and
[17:25.560 --> 17:35.060]  Metasploit to the worker. The last one is the security group file. It is used for access control of
[17:35.060 --> 17:44.060]  communication between servers and services. And also, we have a file to see the outputs of our scripts for
[17:44.060 --> 17:54.300]  debugging. Let's start building our infrastructure with Terraform. First of all, if you are not using
[17:54.340 --> 17:59.600]  a root account, you have to create a user and attach a policy to access AWS services.
[18:16.370 --> 18:20.710]  After user creation, you should add the credentials to your profile.
[18:21.190 --> 18:58.690]  Let's initialize our scripts and validate them. It seems everything is fine. Let's deploy.
[19:29.860 --> 20:10.380]  It seems everything is ready now. Let's check the status of the Kubernetes cluster, nodes, and pods.
[20:23.390 --> 20:44.650]  Let's execute our payload. Now, we are ready to destroy all infrastructure.
[20:58.060 --> 21:01.020]  Everything is clear now. Thanks for listening.
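The command sequence from the demo (initialize, validate, deploy, destroy) can be sketched as a small wrapper around the Terraform CLI. This is a minimal sketch, not the speakers' actual script or pipeline: the function names and the injectable `run` parameter are assumptions for illustration, and placing destroy in a `finally` block reflects the point made in the maintenance part of the talk that destroy should always run, even after a partial deployment.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def run_command(command: Sequence[str]) -> None:
    """Run one CLI command and raise if it exits with a non-zero status."""
    subprocess.run(list(command), check=True)


def run_pipeline(run: Runner = run_command) -> None:
    """Initialize, validate, deploy, and always destroy afterwards."""
    run(["terraform", "init"])
    run(["terraform", "validate"])
    try:
        run(["terraform", "apply", "-auto-approve"])
    finally:
        # Runs even if apply failed halfway, so that partially deployed
        # assets do not stay online.
        run(["terraform", "destroy", "-auto-approve"])
```

Passing the runner as a parameter keeps the sequence testable without cloud credentials: a fake runner can record the commands and simulate a failing apply.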
[21:03.180 --> 21:10.260]  Now, let's get to the maintenance part. So, as you see, we have to validate, deploy, and destroy
[21:10.760 --> 21:18.400]  at every stage of our testing. So, we can automate it with GitLab CI. And for it,
[21:18.400 --> 21:26.220]  we use init and validate first to test and check our scripts. And after that, we use
[21:26.220 --> 21:35.240]  apply to deploy. And if we deploy, we need to keep the state so that we can
[21:35.240 --> 21:41.200]  destroy it afterwards. And after deployment, the last part is destroy, and it should always
[21:41.200 --> 21:53.380]  run in the pipeline, because if you were having some problems with the deployment, some of your assets
[21:53.380 --> 21:58.540]  would get deployed, but some of them would not. And if the destroy stage does not run, these
[21:59.120 --> 22:08.820]  deployed assets will stay online all of the time. So, we should always use the destroy part in our CI,
[22:08.820 --> 22:16.220]  with the destroy command. It will destroy our infrastructure after testing
[22:16.220 --> 22:25.420]  so that we can validate our code. And we can also see whether our infrastructure got deployed
[22:26.100 --> 22:33.500]  correctly or not. So, the main takeaway from our research and work is choosing the correct tools,
[22:34.060 --> 22:42.540]  whether it would be some engagement tools or automation tools. You should always choose
[22:43.100 --> 22:51.840]  whichever you are comfortable with. And for every red team engagement,
[22:51.840 --> 22:58.240]  whether you're doing some long-term penetration testing or red teaming or adversary simulation,
[22:58.240 --> 23:05.920]  do not make your infrastructure complex. Aim for a lean and independent infrastructure.
[23:05.920 --> 23:13.160]  Keep in mind the build and the domain choice, as we mentioned before. Do not make your
[23:13.160 --> 23:20.240]  infrastructure complex so that you can manage it easily. And lastly, use automation and CI
[23:20.240 --> 23:25.980]  so that you can test and validate all of your infrastructure so that you don't have any
[23:25.980 --> 23:33.380]  surprises during the engagement itself. If you have any questions,
[23:33.380 --> 23:38.380]  we will be on Discord throughout the day. You can ask them there. And as I've said, we will
[23:39.260 --> 23:44.600]  share the link to the slides on Discord as well, so that you can also reach the GitHub page.
